Effective Pruning for the Discovery of Conditional Functional Dependencies

Authors

  • Jiuyong Li
  • Jixue Liu
  • Hannu Toivonen
  • Jianming Yong
Abstract

Conditional Functional Dependencies (CFDs) have been proposed as a new type of semantic rule that extends traditional functional dependencies. They have shown great potential for detecting and repairing inconsistent data. Constant CFDs are association rules with 100% confidence. The theoretical search space for the minimal set of CFDs is the set of minimal generators and their closures in the data. This search space is used by the currently most efficient constant CFD discovery algorithm. In this paper, we propose pruning criteria that further reduce the theoretical search space, and we design a fast algorithm for constant CFD discovery. We evaluate the proposed algorithm on a number of medium to large real-world data sets. The proposed algorithm is faster than the currently most efficient constant CFD discovery algorithm, and its running time is linear in the size of the data set.
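The two notions the abstract relies on, a constant CFD as a 100% confidence association rule and the search space of minimal generators and their closures, can be illustrated on a toy relation. The following is a minimal Python sketch, not the algorithm proposed in the paper: the relation `rows`, its attribute names, and the helpers `matches`, `support`, `confidence`, `closure` and `is_minimal_generator` are all hypothetical and chosen for illustration only.

```python
from typing import Dict, List

Row = Dict[str, str]
Pattern = Dict[str, str]  # attribute -> constant value

# Hypothetical toy relation (not taken from the paper).
rows: List[Row] = [
    {"country": "UK", "area_code": "131", "city": "Edinburgh"},
    {"country": "UK", "area_code": "131", "city": "Edinburgh"},
    {"country": "UK", "area_code": "020", "city": "London"},
    {"country": "US", "area_code": "212", "city": "New York"},
]


def matches(row: Row, pattern: Pattern) -> bool:
    """True if the row carries every constant in the pattern."""
    return all(row.get(attr) == value for attr, value in pattern.items())


def support(data: List[Row], pattern: Pattern) -> int:
    """Number of rows that match the pattern."""
    return sum(1 for row in data if matches(row, pattern))


def confidence(data: List[Row], lhs: Pattern, rhs: Pattern) -> float:
    """Fraction of rows matching lhs that also match rhs (0.0 if none match lhs)."""
    covered = [row for row in data if matches(row, lhs)]
    if not covered:
        return 0.0
    return sum(1 for row in covered if matches(row, rhs)) / len(covered)


def closure(data: List[Row], pattern: Pattern) -> Pattern:
    """All attribute=value pairs shared by every row that matches the pattern."""
    covered = [row for row in data if matches(row, pattern)]
    if not covered:
        return dict(pattern)
    first = covered[0]
    return {
        attr: value
        for attr, value in first.items()
        if all(row.get(attr) == value for row in covered)
    }


def is_minimal_generator(data: List[Row], pattern: Pattern) -> bool:
    """True if dropping any single item from the pattern increases its support.

    Checking immediate subsets suffices because support never decreases
    when items are removed.
    """
    full = support(data, pattern)
    for attr in pattern:
        smaller = {a: v for a, v in pattern.items() if a != attr}
        if support(data, smaller) == full:
            return False
    return True
```

On this toy relation, the rule (area_code = 131) -> (city = Edinburgh) has confidence 1.0 and therefore behaves like a constant CFD, whereas (country = UK) -> (city = Edinburgh) does not. The pattern (area_code = 131) is a minimal generator whose closure also contains country = UK and city = Edinburgh, while (country = UK, area_code = 131) is not minimal because dropping country = UK leaves the support unchanged.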

Similar resources

Discover Dependencies from Data - A Review

Functional and inclusion dependency discovery is important to knowledge discovery, database semantics analysis, database design, and data quality assessment. Motivated by the importance of dependency discovery, this paper reviews the methods for functional dependency, conditional functional dependency, approximate functional dependency and inclusion dependency discovery in relational databases ...

Discovering Conditional Functional Dependencies in XML Data

XML data inconsistency has become a serious problem since XML was widely adopted as a standard for data representation on the web. XML-based standards such as OASIS, xCBL and xBRL have been used to report and exchange business and financial information. Such standards focus on technical rather than semantic aspects. XML Functional Dependencies (XFDs) have been introduced to improve XML semantic...

Automatic Discovery of Functional Dependencies and Conditional Functional Dependencies: A Comparative Study

Over the last twenty years, several algorithms have been proposed for automatic rule/constraint discovery from data, for the purpose of data cleaning. These algorithms look for constraints such as functional dependencies (FDs), conditional FDs (CFDs), inclusion dependencies (INDs), conditional INDs (CINDs), association rules, integrity constraints (ICs) and denial constraints (DCs), among other...

Mining Approximate Functional Dependencies as Condensed Representations of Association Rules

Approximate Functional Dependencies (AFD) mined from database relations represent potentially interesting patterns and have proven to be useful for various tasks like feature selection for classification, query optimization and query rewriting. Though the discovery of Functional Dependencies (FDs) from a relational database is a well studied problem, the discovery of AFDs still remains under ex...

Data Dependencies Mining In Database by Removing Equivalent Attributes

Data dependency plays a key role in database normalization, which is a systematic process of verifying database design to ensure the nonexistence of undesirable characteristics. Bad design could incur insertion, update, and deletion anomalies that are the major cause of database inconsistency [1, 2]. The discovery of data dependencies from databases has recently become a significant rese...

Journal:
  • Comput. J.

Volume 56  Issue 

Pages  -

Publication date 2013